Term Distillation In Patent Retrieval

Authors

  • Hideo Itoh
  • Hiroko Mano
  • Yasushi Ogawa
Abstract

In cross-database retrieval, the domain of queries differs from that of the retrieval target in the distribution of term occurrences. This causes incorrect term weighting in the retrieval system which assigns to each term a retrieval weight based on the distribution of term occurrences. To resolve the problem, we propose "term distillation", a framework for query term selection in cross-database retrieval. The experiments using the NTCIR-3 patent retrieval test collection demonstrate that term distillation is effective for cross-database retrieval.


Similar resources

Term Distillation for Cross-DB Retrieval

In cross-DB retrieval, the domain of queries differs from that of the retrieval target in the distribution of term occurrences. This causes incorrect term weighting in the retrieval system which assigns to each term a retrieval weight based on the distribution of term occurrences. To resolve the problem, we propose "term distillation" which is a framework for query term selection in cross-DB ret...


NTCIR-5 Patent Retrieval Experiments at Hitachi

In NTCIR-5, we used five retrieval methods proposed in NTCIR-4: (1) query term weighting using only document frequency, (2) stopword deletion, (3) two-stage patent retrieval, (4) term weighting considering “measurement terms”, and (5) related term expansion. In this paper, we compare the retrieval accuracy for two test sets: 34 main queries in NTCIR-4 and 1189 new queries in NTCIR-5. Then, we e...


Invalidity Patent Search System of NTT DATA

In this paper, we give an overview of our invalidity patent search system for NTCIR-4 PATENT. The system is based on the document retrieval technique and the new methods that are suitable for the invalidity search; the query term extraction based on characteristics of invention, the retrieval model using components of invention, the ranking using the term weighting based on category information...


Test Collections for Patent Retrieval and Patent Classification in the Fifth NTCIR Workshop

This paper describes the test collections produced for the Patent Retrieval Task in the Fifth NTCIR Workshop. We performed the invalidity search task, in which each participant group searches a patent collection for the patents that can invalidate the demand in an existing claim. For this purpose, we performed both document and passage retrieval tasks. We also performed the automatic patent cla...


Experiments on Patent Retrieval at NTCIR-4 Workshop

In the Patent Retrieval Task in NTCIR-4 Workshop, the search topic is the claim in a patent document, so we use the claim text and the IPC information for the similarity calculations between the search topic and each patent document in the collection. We examined the effectiveness of the similarity measure between IPCs and the term weighting for the occurrence positions of the keyword attribute...



Publication date: 2003